Monday, April 28, 2014
While I was at Intent Media I led the data engineering team in rebuilding and
extending the Intent Media data platform. To structure and simplify queries we
relied on Cascalog, a Clojure DSL built on the Cascading library, which in turn
runs on Apache Hadoop.
Cascalog is inspired by Datalog and uses logic programming to simplify query
expression, much like Datomic for Clojure and the more recent DataScript for
ClojureScript. This approach allows simple and concise queries. For example, to
compute the average age per country:
(?<- (stdout) [?country ?avg]
     (location ?person ?country _ _) (age ?person ?age)
     (c/count ?count) (c/sum ?age :> ?sum)
     (div ?sum ?count :> ?avg))
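For readers unfamiliar with logic programming, the query above joins the two sources on the shared ?person variable, groups by ?country, and divides the summed ages by the count. Below is a rough in-memory Ruby sketch of the same computation. The sample data, the two-column shape of the location rows, the method name, and the use of floating-point division are all assumptions made for illustration; this is not how Cascalog executes the query on Hadoop.

```ruby
# Hypothetical sample data standing in for the `location` and `age` sources.
# The real `location` source has four fields; only person and country are kept here.
LOCATIONS = [
  ['alice', 'usa'],
  ['bob', 'usa'],
  ['carol', 'kenya']
].freeze

AGES = [
  ['alice', 30],
  ['bob', 40],
  ['carol', 25]
].freeze

# Join the two relations on person, group the joined rows by country,
# then divide the summed ages by the number of people in each group.
def average_age_by_country(locations, ages)
  age_by_person = ages.to_h
  locations
    .select { |person, _| age_by_person.key?(person) } # inner join on person
    .group_by { |_, country| country }
    .transform_values do |rows|
      total = rows.sum { |person, _| age_by_person[person] }
      total.to_f / rows.size # floating-point division is an assumption
    end
end

AVERAGES = average_age_by_country(LOCATIONS, AGES)
```

The Cascalog version states the same relationships declaratively and leaves the join and grouping strategy to the engine.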
Jon Sondag, a data scientist at Intent Media, recently gave a presentation at
the NYC Clojure Meetup about Cascalog in production. His slides are embedded
below.
It is great to see Cascalog being used in production data platforms.
Wednesday, March 12, 2014
Ona, a company I co-founded, recently built the tallying
software used to aggregate votes in the Libyan
constitutional assembly elections. These votes were cast
throughout the country, on off-shore oil rigs, and at international voting
centers throughout the world.
The Libyan High National Election Commission has generously made the tally
system software open source. All application source code is on GitHub, and
there is an overview of the tallying process as well as additional code
documentation. A description of the technologies used is posted on the Ona
blog.
Friday, January 17, 2014
Benedetta Simeonidis,
Roger Wong, and I gave a lecture on Wednesday
discussing mobile technologies and their intersection with global health.
We demoed
Formhub and talked about
Drishti. We also talked
about the importance of user-centric design in mobile technology.
The slides from our lecture are below:
Wednesday, November 27, 2013
Ivan Willig and I gave a presentation on Monday
introducing Clojure and discussing some of the production Clojure code in the
Intent Media data platform.
The full presentation is available on
github and as interactive
slides.
Sunday, November 10, 2013
We recently released a Helioid API that returns categorized search results.
To retrieve JSON results for a query like “data analytics”, simply append
“?format=json” to the URL, for example:
[http://www.helioid.com/searches/q/data+analytics?format=json](http://www.helioid.com/searches/q/data+analytics?format=json)
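As a small illustration of the URL pattern above, the following Ruby sketch builds the JSON URL for an arbitrary query string. The helper name `helioid_json_url` and the constant `HELIOID_SEARCH_BASE` are hypothetical and not part of any published library; the sketch only constructs the URL and makes no request.

```ruby
require 'uri'

# Base search URL, taken from the example URL above.
HELIOID_SEARCH_BASE = 'http://www.helioid.com/searches/q/'

# Hypothetical helper: form-encode the query (spaces become "+")
# and append the "?format=json" suffix described above.
def helioid_json_url(query)
  encoded = URI.encode_www_form_component(query)
  "#{HELIOID_SEARCH_BASE}#{encoded}?format=json"
end
```

Calling `helioid_json_url('data analytics')` yields the same URL as the example above, which could then be fetched with any HTTP client.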
To make this easier to use we have released open source
Ruby and
Clojure client libraries.
Install the Ruby library with:
then load and fetch categories using:
require 'heliapi'
results = Heliapi.new.web('ruby apis')
results['categories'].keys
which returns:
=> ['Developer',
'Access',
'Provides',
'Rails',
'Building',
'Install',
'Google Api Ruby',
'Ruby Client'
]
To install the Clojure library add heliapi
to your Leiningen project.clj
file:
then load and fetch categories with:
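;; At the REPL; inside an ns form use (:require [heliapi.core :as helioid])
(require '[heliapi.core :as helioid])
;; Extract the name of each returned category
(map :name
     (:categories (helioid/web "helioid")))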
which returns the results as:
=> ("search refinement"
"search engine"
"results"
"helioid choroiditis"
"intranuclear helioid inclusions"
"intranuclear helioid"
"new"
"helioid search")
We will add features to the API and client libraries, and build libraries for
other languages, as requested.